Deep learning has been utilized for the statistical downscaling of climate data. Specifically, a two-dimensional (2D) convolutional neural network (CNN) has been successfully applied to precipitation estimation. This study implements a three-dimensional (3D) CNN to estimate watershed-scale daily precipitation from 3D atmospheric data and compares the results with those obtained from a 2D CNN. The 2D CNN is extended along the time direction (3D-CNN-Time) and the vertical direction (3D-CNN-Vert). The precipitation estimates of these extended CNNs are compared with those of the 2D CNN in terms of the root mean square error (RMSE), Nash-Sutcliffe efficiency (NSE), and 99th-percentile RMSE. It is found that both 3D-CNN-Time and 3D-CNN-Vert improve the model accuracy of precipitation estimation compared with the 2D CNN. 3D-CNN-Vert provided the best estimates for both the training and test periods in terms of RMSE and NSE.
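As a rough illustration of the difference between the two model families, the sketch below (in PyTorch, with invented layer sizes and grid shapes, not the authors' configuration) contrasts a 2D CNN over a (lat, lon) grid with a 3D CNN whose extra depth axis can hold either time steps (3D-CNN-Time) or vertical levels (3D-CNN-Vert):

```python
import torch
import torch.nn as nn

class CNN2D(nn.Module):
    """2D baseline: channels = atmospheric variables, grid = (lat, lon)."""
    def __init__(self, n_vars: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_vars, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1),
        )
    def forward(self, x):           # x: (batch, n_vars, lat, lon)
        return self.net(x)          # daily precipitation estimate

class CNN3D(nn.Module):
    """3D extension: the extra depth axis holds time steps or vertical levels."""
    def __init__(self, n_vars: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(n_vars, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(16, 1),
        )
    def forward(self, x):           # x: (batch, n_vars, depth, lat, lon)
        return self.net(x)

x2d = torch.randn(4, 5, 32, 32)     # 5 variables on a 32x32 grid
x3d = torch.randn(4, 5, 8, 32, 32)  # depth 8 = 8 time steps or 8 levels
print(CNN2D(5)(x2d).shape, CNN3D(5)(x3d).shape)  # -> (4, 1) (4, 1)
```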
An architecture consisting of a one-dimensional convolutional neural network (1D-CNN) coupled in series with a long short-term memory (LSTM) network, referred to as CNNsLSTM, is proposed here for hourly rainfall-runoff modeling. In CNNsLSTM, the CNN component receives hourly meteorological time-series data over a long duration, and the LSTM component then receives the features extracted by the 1D-CNN together with hourly meteorological time-series data over a short duration. As a case study, CNNsLSTM was applied to hourly rainfall-runoff modeling in the Ishikari River watershed, Japan. A meteorological dataset consisting of precipitation, air temperature, evapotranspiration, and longwave radiation was used as input, and river flow was used as the target. To evaluate the performance of the proposed CNNsLSTM, its results were compared with those of a 1D-CNN, an LSTM with only hourly inputs (LSTMwHour), a parallel architecture of 1D-CNN and LSTM (CNNpLSTM), and an LSTM architecture using both daily and hourly input data (LSTMwDpH). CNNsLSTM showed clear improvements in estimation accuracy over the three conventional architectures (1D-CNN, LSTMwHour, and CNNpLSTM) and the recently proposed LSTMwDpH. Compared with the observed flows, the median NSE values for the test period were 0.455-0.469 for 1D-CNN (based on NCHF = 8, 16, and 32, the number of channels of the feature maps in the first CNN layer), 0.639-0.656 for CNNpLSTM (based on NCHF = 8, 16, and 32), 0.745 for LSTMwHour, 0.831 for LSTMwDpH, and 0.865-0.873 for CNNsLSTM (based on NCHF = 8, 16, and 32). Furthermore, the proposed CNNsLSTM reduced the median RMSE of 1D-CNN by 50.2%-51.4%, of CNNpLSTM by 37.4%-40.8%, of LSTMwHour by 27.3%-29.5%, and of LSTMwDpH by 10.6%-13.4%.
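A minimal sketch of this serial coupling is given below (hypothetical layer sizes; only NCHF, the first-layer channel count, is taken from the abstract). The 1D-CNN summarizes the long-duration hourly record, and its features are appended to each step of the short-duration series fed to the LSTM:

```python
import torch
import torch.nn as nn

class CNNsLSTM(nn.Module):
    def __init__(self, n_met: int, nchf: int = 16, hidden: int = 64):
        super().__init__()
        # 1D-CNN over the long-duration hourly record
        self.cnn = nn.Sequential(
            nn.Conv1d(n_met, nchf, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        # LSTM over the short-duration hourly record, with the CNN
        # features appended to every time step
        self.lstm = nn.LSTM(n_met + nchf, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x_long, x_short):
        # x_long: (batch, n_met, T_long)   x_short: (batch, T_short, n_met)
        feat = self.cnn(x_long).squeeze(-1)               # (batch, nchf)
        feat = feat.unsqueeze(1).expand(-1, x_short.size(1), -1)
        out, _ = self.lstm(torch.cat([x_short, feat], dim=-1))
        return self.head(out[:, -1])                      # hourly discharge

model = CNNsLSTM(n_met=4)
q = model(torch.randn(2, 4, 720), torch.randn(2, 72, 4))
print(q.shape)  # (2, 1)
```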
This study investigates what relationships deep learning methods can identify between input and output data. As a case study, rainfall-runoff modeling in a snow-dominated watershed by means of a long short-term memory (LSTM) network was selected. Daily precipitation and mean air temperature were used as model inputs to estimate daily flow discharge. After model training and validation, two experimental simulations were conducted with hypothetical inputs in place of the observed meteorological data to clarify the response of the trained model to its inputs. The first numerical experiment showed that even without input precipitation, the trained model generated flow discharge, particularly winter low flow and high flow during the snowmelt period. The effects of warmer and colder conditions on flow discharge were also replicated by the trained model, even without precipitation. Furthermore, the model reflected only 17-39% of the total precipitation mass during the snow accumulation period in the total annual flow discharge, revealing a strong lack of water mass conservation. The results of this study indicate that deep learning methods may not properly learn the explicit physical relationships between input and target variables, although they are still capable of maintaining a strong goodness of fit.
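The hypothetical-input experiments can be pictured as follows. The sketch below uses an untrained stand-in model and invented scenario names; the zeroed precipitation and the temperature shifts are the manipulations described above:

```python
import torch

def run_scenarios(model, precip, temp):
    """precip, temp: (T,) daily series; returns discharge per scenario."""
    def simulate(p, t):
        x = torch.stack([p, t], dim=-1).unsqueeze(0)  # (1, T, 2)
        with torch.no_grad():
            return model(x).squeeze(0)
    return {
        "observed":       simulate(precip, temp),
        "no_precip":      simulate(torch.zeros_like(precip), temp),
        "warmer_no_prec": simulate(torch.zeros_like(precip), temp + 2.0),
        "colder_no_prec": simulate(torch.zeros_like(precip), temp - 2.0),
    }

class DummyLSTM(torch.nn.Module):  # stand-in for the trained model
    def __init__(self):
        super().__init__()
        self.lstm = torch.nn.LSTM(2, 16, batch_first=True)
        self.head = torch.nn.Linear(16, 1)
    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out).squeeze(-1)

scen = run_scenarios(DummyLSTM(), torch.rand(365), torch.randn(365) * 10)
print({k: float(v.sum()) for k, v in scen.items()})  # annual totals
```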
This study proposes two straightforward yet effective approaches to reduce the computational time required for time-series modeling with a recurrent neural network (RNN) by using multi-time-scale time-series data as input. One approach feeds coarse and fine temporal resolutions of the input time series to the RNN in parallel. The other concatenates the coarse and fine temporal resolutions of the input time-series data along the time axis before treating them as the RNN input. In both approaches, the finer-resolution data are first utilized to learn the fine-scale temporal behavior of the target data, and the coarser-resolution data are then expected to capture long-duration dependencies between the input and target variables. The proposed approaches were implemented for hourly rainfall-runoff modeling in a snow-dominated watershed by employing a long short-term memory (LSTM) network, a newer type of RNN. Daily and hourly meteorological data were used as input, and hourly flow discharge was taken as the target. The results confirm that both proposed approaches can significantly reduce the computational time of RNN training (by a factor of up to 32.4). Furthermore, one of the proposed approaches improves the estimation accuracy.
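The two input arrangements might be prepared as in the sketch below (assumed shapes; the coarse series here is simply a daily mean of the hourly data). In the concatenation variant, the long history enters at daily resolution and only the recent past at hourly resolution, which shortens the sequence the RNN must process:

```python
import torch

def parallel_inputs(hourly):               # hourly: (T_hours, n_vars)
    """Approach 1: hand coarse and fine series to two parallel RNN branches."""
    daily = hourly.reshape(-1, 24, hourly.size(-1)).mean(dim=1)
    return hourly, daily

def concatenated_inputs(hourly, n_recent_hours=72):
    """Approach 2: coarse history first, then the recent fine series."""
    daily = hourly.reshape(-1, 24, hourly.size(-1)).mean(dim=1)
    recent = hourly[-n_recent_hours:]
    return torch.cat([daily, recent], dim=0)   # one sequence along time

x = torch.randn(30 * 24, 4)   # 30 days of hourly data, 4 variables
print(concatenated_inputs(x).shape)  # (30 + 72, 4) instead of (720, 4)
```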
Agents that can follow language instructions are expected to be useful in a variety of situations such as navigation. However, training neural network-based agents requires numerous paired trajectories and languages. This paper proposes using multimodal generative models for semi-supervised learning in instruction following tasks. The models learn a shared representation of the paired data and enable semi-supervised learning by reconstructing unpaired data through this representation. Key challenges in applying the models to sequence-to-sequence tasks, including instruction following, are learning a shared representation of variable-length multimodal data and incorporating attention mechanisms. To address these problems, this paper proposes a novel network architecture to absorb the difference in the sequence lengths of the multimodal data. In addition, to further improve performance, this paper shows how to incorporate the generative model-based approach with an existing semi-supervised method called a speaker-follower model, and proposes a regularization term that improves inference using unpaired trajectories. Experiments on BabyAI and Room-to-Room (R2R) environments show that the proposed method improves the performance of instruction following by leveraging unpaired data, and improves the performance of the speaker-follower model by 2% to 4% in R2R.
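One way to picture the shared-representation idea is the toy sketch below, which pools each modality into a fixed-size vector for brevity (all sizes are invented, and the KL and cross-modal terms of a full multimodal VAE are omitted). Paired batches train both encoders and decoders through the shared latent space; unpaired batches contribute a reconstruction loss through one modality alone:

```python
import torch
import torch.nn as nn

class SharedVAE(nn.Module):
    def __init__(self, d_lang=32, d_traj=48, d_z=16):
        super().__init__()
        self.enc_lang = nn.Linear(d_lang, 2 * d_z)   # -> mean, log-variance
        self.enc_traj = nn.Linear(d_traj, 2 * d_z)
        self.dec_lang = nn.Linear(d_z, d_lang)
        self.dec_traj = nn.Linear(d_z, d_traj)

    def sample(self, stats):
        mu, logvar = stats.chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

    def loss(self, lang=None, traj=None):
        """Paired batch: pass both. Unpaired batch: pass only one modality."""
        total = 0.0
        for x, enc, dec in [(lang, self.enc_lang, self.dec_lang),
                            (traj, self.enc_traj, self.dec_traj)]:
            if x is None:
                continue
            z = self.sample(enc(x))
            total = total + ((dec(z) - x) ** 2).mean()  # reconstruction
        return total

m = SharedVAE()
paired = m.loss(lang=torch.randn(8, 32), traj=torch.randn(8, 48))
unpaired = m.loss(traj=torch.randn(8, 48))   # semi-supervised signal
print(float(paired), float(unpaired))
```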
This paper proposes a novel sequence-to-sequence (seq2seq) model with a musical-note-position-aware attention mechanism for singing voice synthesis (SVS). A seq2seq modeling approach that can simultaneously perform acoustic and temporal modeling is attractive. However, because of the difficulty of temporal modeling of singing voices, many recent SVS systems with encoder-decoder-based models still rely explicitly on duration information generated by additional modules. Although some studies perform simultaneous modeling using seq2seq models with an attention mechanism, their temporal modeling lacks robustness. The proposed attention mechanism is designed to estimate the attention weights by considering the rhythm given by the musical score. Furthermore, several techniques are also introduced to improve the modeling performance for the singing voice. Experimental results indicate that the proposed model is effective in terms of both naturalness and robustness of timing.
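One plausible reading of such a mechanism (not the paper's exact formulation) is to add a score-derived bias to the usual content-based attention scores, favoring encoder steps whose note position matches the decoder's current position in the score:

```python
import torch
import torch.nn.functional as F

def position_aware_attention(query, keys, note_pos_dec, note_pos_enc):
    """query: (d,), keys: (T, d); note positions are scalars per step.
    The bias term favors encoder steps whose note position matches the
    decoder's current position in the score."""
    content = keys @ query / keys.size(-1) ** 0.5   # (T,) content scores
    rhythm = -(note_pos_enc - note_pos_dec) ** 2    # (T,) score-based prior
    return F.softmax(content + rhythm, dim=0)

w = position_aware_attention(torch.randn(8), torch.randn(20, 8),
                             torch.tensor(5.0), torch.arange(20.0))
print(w.sum())  # weights sum to 1, peaked near note position 5
```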
Bayesian optimization (BO) is often used for accelerator tuning due to its high sample efficiency. However, the computational scalability of training over large data sets can be problematic, and adopting historical data in a computationally efficient way is not trivial. Here, we exploit a neural network model trained over historical data as the prior mean of BO for FRIB Front-End tuning.
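The idea can be sketched as residual GP regression: the GP is fit to y - m(x), where m is the history-trained network, so the posterior mean reverts to the network's prediction away from observed points. The toy 1-D example below uses an invented kernel and a stand-in for the network:

```python
import torch

def rbf(a, b, ls=0.3):
    """Squared-exponential kernel between 1-D point sets a and b."""
    return torch.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def posterior_mean(x_query, x_obs, y_obs, prior_mean, noise=1e-4):
    K = rbf(x_obs, x_obs) + noise * torch.eye(len(x_obs))
    resid = y_obs - prior_mean(x_obs)            # GP models the residual
    alpha = torch.linalg.solve(K, resid)
    return prior_mean(x_query) + rbf(x_query, x_obs) @ alpha

nn_prior = lambda x: torch.sin(x)                # stand-in for the trained net
x_obs = torch.tensor([0.1, 0.5, 0.9])
y_obs = torch.sin(x_obs) + 0.1                   # observations offset from prior
print(posterior_mean(torch.linspace(0, 2, 5), x_obs, y_obs, nn_prior))
```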
Our team, Hibikino-Musashi@Home (HMA for short), was founded in 2010 and is based in the Kitakyushu Science and Research Park, Japan. We have participated in the open platform league of the RoboCup@Home Japan Open competition every year since 2010. Moreover, we participated in RoboCup 2017 Nagoya as both an open platform league and a domestic standard platform league team. Currently, the Hibikino-Musashi@Home team has 20 members from seven different laboratories based at the Kyushu Institute of Technology. In this paper, we introduce the activities of our team and the technologies we use.
Graph neural networks (GNNs) have become the backbone of myriad tasks involving graphs and similar topological data structures. Although many works have been established in domains related to node and graph classification/regression tasks, they mostly deal with a single task. Continual learning on graphs is largely unexplored, and existing graph continual learning approaches are limited to the task-incremental learning scenario. This paper proposes a continual learning strategy that combines architecture-based and memory-based approaches. The structural learning strategy is driven by reinforcement learning, in which a controller network is trained to determine the optimal number of nodes to be added to or pruned from the base network when a new task is observed, ensuring sufficient network capacity. The parameter learning strategy is underpinned by the concept of dark experience replay to cope with the catastrophic forgetting problem. Our approach is numerically validated on several graph continual learning benchmark problems in both task-incremental and class-incremental learning settings. Compared with recently published works, our approach demonstrates improved performance in both settings. The implementation code can be found at https://github.com/codexhammer/gcl.
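The dark-experience-replay component can be sketched as follows (a generic buffer, not the repository's implementation): past inputs are stored together with the logits the network produced at the time, and replayed later as a distillation target:

```python
import random
import torch
import torch.nn.functional as F

class DERBuffer:
    def __init__(self, capacity=500):
        self.data, self.capacity = [], capacity

    def add(self, x, logits):
        """Store an input together with the logits produced for it."""
        if len(self.data) < self.capacity:
            self.data.append((x.detach(), logits.detach()))

    def replay_loss(self, model, batch_size=8, alpha=0.5):
        """Distillation term added to the current task's loss."""
        if not self.data:
            return torch.tensor(0.0)
        xs, zs = zip(*random.sample(self.data, min(batch_size, len(self.data))))
        x, z = torch.stack(xs), torch.stack(zs)
        return alpha * F.mse_loss(model(x), z)   # match the stored logits

net = torch.nn.Linear(4, 3)
buf = DERBuffer()
x = torch.randn(4)
buf.add(x, net(x))                  # during training on an earlier task
print(float(buf.replay_loss(net)))  # during training on a later task
```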
We present XMem, a video object segmentation architecture for long videos with unified feature memory stores inspired by the Atkinson-Shiffrin memory model. Prior work on video object segmentation typically uses only one type of feature memory. For videos longer than a minute, a single feature memory model tightly links memory consumption and accuracy. In contrast, following the Atkinson-Shiffrin model, we develop an architecture that incorporates multiple independent yet deeply connected feature memory stores: a rapidly updated sensory memory, a high-resolution working memory, and a compact long-term memory. Crucially, we develop a memory potentiation algorithm that routinely consolidates actively used working memory elements into the long-term memory, which avoids memory explosion and minimizes performance decay for long-term prediction. Combined with a new memory reading mechanism, XMem greatly exceeds state-of-the-art performance on long-video datasets while being on par with state-of-the-art methods (which do not work on long videos) on short-video datasets. Code is available at https://hkchengrex.github.io/xmem
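A schematic of the three stores and the consolidation step might look like the sketch below (invented sizes and policies, not the released implementation): working-memory entries that attention reads most often are promoted into the compact long-term store:

```python
import torch

class MultiStoreMemory:
    def __init__(self, d=64, work_capacity=8):
        self.sensory = torch.zeros(d)          # overwritten every frame
        self.working, self.usage = [], []      # high-resolution, recent
        self.long_term = []                    # compact, sustained
        self.work_capacity = work_capacity

    def write(self, feat):
        self.sensory = feat                    # rapid update
        self.working.append(feat)
        self.usage.append(0.0)
        if len(self.working) > self.work_capacity:
            self.consolidate()

    def read(self, query):
        keys = torch.stack(self.working + self.long_term)
        attn = torch.softmax(keys @ query, dim=0)
        for i in range(len(self.working)):     # track which entries are used
            self.usage[i] += float(attn[i])
        return attn @ keys

    def consolidate(self):
        # promote the most frequently read entry; keep only recent ones
        best = max(range(len(self.usage)), key=self.usage.__getitem__)
        self.long_term.append(self.working[best])
        self.working, self.usage = self.working[-4:], self.usage[-4:]

mem = MultiStoreMemory()
for _ in range(10):
    mem.write(torch.randn(64))
print(mem.read(torch.randn(64)).shape, len(mem.long_term))  # (64,) 1
```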